Large Dataset Compression Approach Using Intelligent Technique

Authors

  • Ahmed Tariq Sadiq
  • Mehdi G. Duaimi
  • Rasha Subhi Ali
Abstract

Data clustering is the process of putting similar data into groups. A clustering algorithm partitions a data set into several groups such that the similarity within a group is larger than the similarity among groups. Association rules are one possible method for data analysis. An association rule algorithm generates a huge number of rules, many of which are redundant. The main idea of this paper is to compress a large database by combining clustering techniques with association rule algorithms. In the first stage, the database is compressed using clustering techniques, followed by an association rule algorithm. An adaptive k-means clustering algorithm is proposed together with the Apriori algorithm. In many experiments, using the adaptive k-means algorithm and the Apriori algorithm together gave a better compression ratio and a smaller compressed file size than using either algorithm alone. The experiments were run on databases of several different sizes. The Apriori algorithm increases the compression ratio of the adaptive k-means algorithm when they are used together, but it takes more compression time than adaptive k-means does. These algorithms are presented and their results are compared.
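
The two building blocks named in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: plain Lloyd's k-means stands in for the adaptive variant (which the abstract does not describe), the `encode` helper and its `P1` code token are hypothetical, and compression ratio is assumed to mean original size divided by compressed size, measured here in stored tokens.

```python
from itertools import combinations


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on tuples of numbers (NOT the paper's adaptive variant)."""
    centers = [tuple(p) for p in points[:k]]  # assumption: first k points as initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


def apriori(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    current = {frozenset([i]) for t in transactions for i in t}
    frequent, size = {}, 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join frequent sets, then prune candidates that have an infrequent subset.
        joined = {a | b for a in level for b in level if len(a | b) == size}
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        }
    return frequent


def encode(transactions, pattern, code):
    """Hypothetical encoder: replace each occurrence of `pattern` with one `code` token."""
    pattern = frozenset(pattern)
    return [
        (frozenset(t) - pattern) | {code} if pattern <= frozenset(t) else frozenset(t)
        for t in transactions
    ]


def compression_ratio(original_size, compressed_size):
    """Assumed definition: original size divided by compressed size."""
    return original_size / compressed_size


# Toy usage: group numeric records, then shorten transactions via a frequent itemset.
labels, _ = kmeans([(1, 1), (1, 2), (9, 9), (10, 9)], k=2)
rows = [{"milk", "bread", "eggs"}, {"milk", "bread"}, {"milk", "bread", "jam"}, {"tea"}] * 5
frequent = apriori(rows, min_support=15)
pattern = max(frequent, key=len)  # largest frequent itemset
encoded = encode(rows, pattern, code="P1")
original = sum(len(r) for r in rows)
compressed = sum(len(r) for r in encoded) + len(pattern) + 1  # plus one dictionary entry
print(labels, sorted(pattern), compression_ratio(original, compressed))
```

In a staged pipeline like the one the abstract outlines, the mining step would presumably run on each cluster separately, so that records grouped together share more frequent itemsets; the sketch keeps the two steps independent for brevity.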

Similar resources

Implementation of VLSI Based Image Compression Approach on Reconfigurable Computing System - A Survey

Image data require huge amounts of disk space and large bandwidths for transmission. Hence, image compression is necessary to reduce the amount of data required to represent a digital image. An efficient technique for image compression is therefore in high demand. Although lots of compression techniques are available, the technique which is faster, memory efficient and simple, surely...

Intelligent scalable image watermarking robust against progressive DWT-based compression using genetic algorithms

Image watermarking refers to the process of embedding an authentication message, called watermark, into the host image to uniquely identify the ownership. In this paper a novel, intelligent, scalable, robust wavelet-based watermarking approach is proposed. The proposed approach employs a genetic algorithm to find nearly optimal positions to insert watermark. The embedding positions coded as chr...

Feature Extraction and Efficiency Comparison Using Dimension Reduction Methods in Sentiment Analysis Context

Nowadays, users can share their ideas and opinions with widespread access to the Internet and especially social networks. On the other hand, the analysis of people's feelings and ideas can play a significant role in the decision making of organizations and producers. Hence, sentiment analysis or opinion mining is an important field in natural language processing. One of the most common ways to ...

Diagnosis of Diabetes Using an Intelligent Approach Based on Bi-Level Dimensionality Reduction and Classification Algorithms

Objective: Diabetes is one of the most common metabolic diseases. Earlier diagnosis of diabetes and treatment of hyperglycemia and related metabolic abnormalities is of vital importance. Diagnosis of diabetes via proper interpretation of the diabetes data is an important classification problem. Classification systems help the clinicians to predict the risk factors that cause the diabetes or pre...

An Intelligent System’s Approach for Revitalization of Brown Fields using only Production Rate Data

State-of-the-art data analysis in production allows engineers to characterize reservoirs using production data. This saves companies large sums that would otherwise be spent on well testing and reservoir simulation and modeling. There are two shortcomings with today’s production data analysis: It needs bottom-hole or well-head pressure data in addition to data for rating reservoirs’ characteri...


Publication date: 2013